1. Introduction.

In this note, we discuss the qualitative behavior of the probability of misclassifying observations in $\mathbb{R}^n$ from two normally distributed populations as the classification regions are varied in a prescribed way. This discussion is intended to provide a preliminary generalization of the results obtained by Walton [4] for the case of normally distributed observations in $\mathbb{R}^n$ with varying a priori probabilities. We hope to provide quantitative analogues of these results in subsequent reports.

We assume that observations from two populations, population 1 and population 2, are known to have a priori probabilities $\alpha_{0i}$ and normal density functions

\[ p_{0i}(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma_{0i}|^{1/2}} \exp\left(-\tfrac{1}{2}(x-\mu_{0i})^T \Sigma_{0i}^{-1} (x-\mu_{0i})\right), \qquad i = 1, 2, \]

for $x = (x_1, \dots, x_n)^T \in \mathbb{R}^n$. We further assume that these observations are classified, not by using the true Bayes optimal (maximum likelihood) classification scheme for populations 1 and 2, but by using the Bayes optimal classification scheme defined by a priori probabilities $\alpha_i(t)$ and density functions

\[ p_i(x,t) = \frac{1}{(2\pi)^{n/2}\,|\Sigma_i(t)|^{1/2}} \exp\left(-\tfrac{1}{2}(x-\mu_i(t))^T \Sigma_i(t)^{-1} (x-\mu_i(t))\right), \qquad i = 1, 2, \]

where the functions $\alpha_i(t)$, $\mu_i(t)$, and $\Sigma_i(t)$ are continuously differentiable functions of the parameter $t$ in a neighborhood of $t = 0$. This is to say that an observation $x \in \mathbb{R}^n$ is classified as coming from population 1 if and only if $\alpha_1(t)\,p_1(x,t) \ge \alpha_2(t)\,p_2(x,t)$. (We assume that $p_1(x,t) \not\equiv p_2(x,t)$ as a function of $x$ for $t$ in a neighborhood of $t = 0$.)

Under these assumptions, the probability of error in classifying an observation is a function of $t$ in a neighborhood of $t = 0$, given by

\[ P_e(t) = \int_{R_1(t)} \alpha_{02}\,p_{02}(x)\,dx + \int_{R_2(t)} \alpha_{01}\,p_{01}(x)\,dx, \]

where the regions $R_1(t)$ and $R_2(t)$ are defined as follows. Let

\[ F(x,t) = \log\frac{\alpha_1(t)\,p_1(x,t)}{\alpha_2(t)\,p_2(x,t)} = \log\frac{\alpha_1(t)\,|\Sigma_2(t)|^{1/2}}{\alpha_2(t)\,|\Sigma_1(t)|^{1/2}} - \tfrac{1}{2}(x-\mu_1(t))^T\Sigma_1(t)^{-1}(x-\mu_1(t)) + \tfrac{1}{2}(x-\mu_2(t))^T\Sigma_2(t)^{-1}(x-\mu_2(t)). \]

Then $R_1(t) = \{x \in \mathbb{R}^n : F(x,t) \ge 0\}$ and $R_2(t) = \{x \in \mathbb{R}^n : F(x,t) < 0\}$. (For a more thorough discussion of the probability of error, see Anderson [1].)

Our goal is to examine qualitatively the rate at which $P_e(t)$ varies as $t$ varies in a neighborhood of zero. In our main result, the exact rate of variation of $P_e(t)$ is seen to depend on a number of factors. However, an inequality of the form

\[ |P_e(t) - P_e(0)| \le K|t|^{\alpha} \]

is obtained in every case. In other words, $P_e(t)$ is always Hölder continuous at $t = 0$. In the following, the exponent $\alpha$ is determined precisely in each case. The constant $K$ is merely asserted to exist; no estimate of its size is given. Unfortunately, to implement such an inequality in practice, one must know both the size of $K$ and the range of $t$ for which the inequality holds.
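To make the form of this inequality concrete, the following minimal sketch evaluates the probability of error in closed form for a one-dimensional example and computes the ratios $|P_e(t) - P_e(0)|/|t|$. The populations, the true priors, the parameter path $\mu_1(t) = -1 + t$, and all function names are illustrative assumptions, not taken from the results discussed here. In this example the discriminant is linear in $x$ with non-vanishing gradient, so the ratios are expected to stay bounded.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical one-dimensional setup (an assumption for illustration):
# true populations N(-1, 1) and N(+1, 1) with true priors 0.3 and 0.7.
TRUE_PRIOR_1, TRUE_PRIOR_2 = 0.3, 0.7
TRUE_MEAN_1, TRUE_MEAN_2 = -1.0, 1.0

def error_probability(t: float) -> float:
    """P_e(t) when the classifier assumes equal priors, unit variances,
    and means mu_1(t) = -1 + t, mu_2(t) = 1.  The discriminant is then
    linear in x and R_1(t) = (-inf, b(t)] with b(t) = t / 2."""
    boundary = t / 2.0
    # Population 2 falling in R_1(t), plus population 1 falling in R_2(t).
    miss_2 = TRUE_PRIOR_2 * normal_cdf(boundary - TRUE_MEAN_2)
    miss_1 = TRUE_PRIOR_1 * (1.0 - normal_cdf(boundary - TRUE_MEAN_1))
    return miss_2 + miss_1

def lipschitz_ratios(ts):
    """Ratios |P_e(t) - P_e(0)| / |t| for the given non-zero values of t."""
    p0 = error_probability(0.0)
    return [abs(error_probability(t) - p0) / abs(t) for t in ts]

ratios = lipschitz_ratios([0.1, 0.01, 0.001, -0.001, -0.01, -0.1])
```

Any upper bound on `ratios` plays the role of the constant $K$ for the sampled values of $t$ only; it says nothing about the range of $t$ on which such a bound holds in general.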

In the sequel, large constants are denoted generically by $K$, $K'$, etc. Distinguished constants are subscripted. The common boundary of $R_1(t)$ and $R_2(t)$ is denoted by $S(t)$.

2. The variation of $P_e(t)$.

Our objective is to prove the following theorem. 

Theorem: If $\nabla F(x,0) \ne 0$ on $S(0)$, then there exists a constant $K$ such that $|P_e(t) - P_e(0)| \le K|t|$ for small $t$. If $\nabla F(x,0)$ vanishes somewhere on $S(0)$, then there exists a constant $K$ such that $|P_e(t) - P_e(0)| \le K|t|^{m/(m+1)}$ for small $t$, where $m$ is the number of non-zero eigenvalues (counting multiplicity) of $\Sigma_1(0)^{-1} - \Sigma_2(0)^{-1}$.

Remarks: If $\nabla F(x,0)$ vanishes anywhere, then the assumption $p_1(x,0) \not\equiv p_2(x,0)$ implies that $m > 0$. Thus $P_e(t)$ is Hölder continuous at $t = 0$ with exponent at least $\tfrac{1}{2}$. In the special case in which $\alpha_i(0) = \alpha_{0i}$, $\mu_i(0) = \mu_{0i}$, and $\Sigma_i(0) = \Sigma_{0i}$, $i = 1, 2$, exponents of Hölder continuity larger than those specified above can be obtained. The determination of these exponents is not carried out here.
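The exponent $\tfrac{1}{2}$ can be observed numerically in a one-dimensional example in which the gradient of the discriminant vanishes on the decision boundary at $t = 0$. The sketch below is an illustration under assumed parameters, not part of the proof: the classifier densities are $N(0, 2^2)$ and $N(0, 1)$, the classifier priors are chosen (hypothetically) so that the discriminant is $F(x,t) = \lambda x^2 - t$ with $\lambda = 3/8$, and the true priors are both $0.5$. All names are hypothetical.

```python
import math

# Hypothetical one-dimensional setup (an assumption for illustration).
SIGMA_1, SIGMA_2 = 2.0, 1.0                   # standard deviations, both means are 0
LAM = 0.5 / SIGMA_2**2 - 0.5 / SIGMA_1**2     # F(x, t) = LAM * x**2 + C(t), LAM = 3/8
TRUE_PRIOR_1 = TRUE_PRIOR_2 = 0.5             # true priors of populations 1 and 2

def constant_term(t: float) -> float:
    """C(t) = log(alpha_1(t) sigma_2 / (alpha_2(t) sigma_1)).  The classifier
    priors are assumed to satisfy alpha_1(t) / alpha_2(t) = 2 * exp(-t),
    which gives C(t) = -t exactly."""
    return -t

def error_probability(t: float) -> float:
    c = constant_term(t)
    # R_2(t) = {x : F(x, t) < 0} = (-h, h); it is empty when C(t) >= 0.
    h = math.sqrt(-c / LAM) if c < 0 else 0.0
    prob_1_in_r2 = math.erf(h / (SIGMA_1 * math.sqrt(2.0)))
    prob_2_in_r1 = 1.0 - math.erf(h / (SIGMA_2 * math.sqrt(2.0)))
    return TRUE_PRIOR_2 * prob_2_in_r1 + TRUE_PRIOR_1 * prob_1_in_r2

def observed_exponent(t_a: float, t_b: float) -> float:
    """Slope of log|P_e(t) - P_e(0)| against log t between two positive t values."""
    p0 = error_probability(0.0)
    d_a = abs(error_probability(t_a) - p0)
    d_b = abs(error_probability(t_b) - p0)
    return math.log(d_a / d_b) / math.log(t_a / t_b)

exponent = observed_exponent(1e-4, 1e-6)
```

Here $m = 1$, and the observed slope is close to $m/(m+1) = \tfrac{1}{2}$, which suggests that the exponent in the theorem cannot be improved in general for this configuration.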

Before beginning the proof of the theorem, we establish several lemmas. 

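For a subset $X \subseteq \mathbb{R}^n$, define

\[ d(x,X) = \begin{cases} \inf_{y \in X} |x - y| & \text{if } X \ne \emptyset, \\ +\infty & \text{if } X = \emptyset. \end{cases} \]

Let $T = \{x \in \mathbb{R}^n : \nabla F(x,0) = 0\}$.

For non-negative $p$ and $q$ and positive $r$ and $s$, define

\[ L_{p,q,r,s}(t) = \{x \in \mathbb{R}^n : |x| > r|t|^{-p}\}, \]
\[ M_{p,q,r,s}(t) = \{x \in \mathbb{R}^n : |x| \le r|t|^{-p} \text{ and } d(x,T) \ge s|t|^{q}\}, \]
\[ N_{p,q,r,s}(t) = \{x \in \mathbb{R}^n : |x| \le r|t|^{-p} \text{ and } d(x,T) < s|t|^{q}\}. \]

When there is no danger of ambiguity, we will omit the subscripts $p$, $q$, $r$, and $s$ and write $L_{p,q,r,s}(t) = L(t)$, etc.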

Lemma 1: Suppose that $0 \le q < 1$ and $0 \le p < 1 - q$. Then there exists a constant $K$, independent of $p$, $q$, $r$, and $s$, such that, if $t$ is sufficiently small, then $|\nabla F(x,t)| \ge Ks|t|^{q}$ for all $x \in M(t)$.

Proof: Writing $F(x,t) = x^T A(t)x + B(t)x + C(t)$, one obtains $\nabla F(x,t) = 2A(t)x + B(t)^T$ and $\frac{\partial}{\partial t}\nabla F(x,t) = 2\frac{d}{dt}A(t)\,x + \frac{d}{dt}B(t)^T$. From this, it is seen that there exist constants $K'$ and $K''$, independent of $p$, $q$, $r$, and $s$, such that $|\nabla F(x,0)| \ge K's|t|^{q}$ and $|\frac{\partial}{\partial t}\nabla F(x,t)| \le K''(1+r)|t|^{-p}$ for $x \in M(t)$ and $t$ small. It follows that there exists a constant $K$, independent of $p$, $q$, $r$, and $s$, such that, for $x \in M(t)$,

\[ |\nabla F(x,t)| \ge K's|t|^{q} - K''(1+r)|t|^{1-p} \ge Ks|t|^{q} \]

whenever $t$ is small.


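Lemma 2: Suppose that $0 \le q \le \tfrac{1}{2}$ and $0 \le p \le \tfrac{1}{2} - q$. If $t$ and $\frac{1+r}{s}$ are sufficiently small, or if $t$ is sufficiently small and $0 < p < \tfrac{1}{2} - q$, then, for $|t_0| \le |t|$ and $x \in M(t) \cap S(t_0)$, the solution $y(x,\tau)$ of the initial-value problem

\[ \frac{\partial}{\partial \tau} y(x,\tau) = -\frac{\frac{\partial F}{\partial t}(y,\tau)}{|\nabla F(y,\tau)|^{2}}\,\nabla F(y,\tau), \qquad y(x,t_0) = x, \]

exists and is continuously differentiable in $x$ and $\tau$ for $|\tau| \le |t|$.

Remarks: Note that, wherever $y(x,\tau)$ exists, $y(x,\tau) \in S(\tau)$ and $\frac{\partial y}{\partial \tau}(x,\tau)$ is normal to $S(\tau)$. Indeed, $\frac{d}{d\tau}F(y(x,\tau),\tau) = \nabla F \cdot \frac{\partial y}{\partial \tau} + \frac{\partial F}{\partial t} = 0$, so $F(y(x,\tau),\tau) = F(x,t_0) = 0$. It is seen in the proof that

\[ |y(x,\tau) - x| \le K\,\frac{(1+r)^{2}}{s}\,|t|^{1-2p-q}, \]

where the constant $K$ is independent of $p$, $q$, $r$, $s$, and $t$ for small $t$.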
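The fact that the solution of the initial-value problem remains on $S(\tau)$ can be checked numerically. The following sketch is only an illustration: the quadratic function $F(x,t) = x_1^2 + 2x_2^2 + t x_1 - (1+t)$ in $\mathbb{R}^2$, the starting point, the initial time $t_0 = 0$, and the use of a classical fourth-order Runge-Kutta scheme are all assumptions made here, and the function names are hypothetical.

```python
import math

# Hypothetical quadratic discriminant in R^2 (an assumption for illustration):
#   F(x, t) = x1^2 + 2*x2^2 + t*x1 - (1 + t)
def F(x, t):
    return x[0] ** 2 + 2.0 * x[1] ** 2 + t * x[0] - (1.0 + t)

def grad_F(x, t):
    return (2.0 * x[0] + t, 4.0 * x[1])

def dF_dt(x, t):
    return x[0] - 1.0

def velocity(y, tau):
    """Right-hand side -(dF/dt) * grad F / |grad F|^2 of the initial-value problem."""
    g = grad_F(y, tau)
    scale = -dF_dt(y, tau) / (g[0] ** 2 + g[1] ** 2)
    return (scale * g[0], scale * g[1])

def flow(x, tau_end, steps=200):
    """Integrate y' = velocity(y, tau), y(0) = x, with classical RK4."""
    h = tau_end / steps
    y, tau = x, 0.0
    for _ in range(steps):
        k1 = velocity(y, tau)
        k2 = velocity((y[0] + 0.5 * h * k1[0], y[1] + 0.5 * h * k1[1]), tau + 0.5 * h)
        k3 = velocity((y[0] + 0.5 * h * k2[0], y[1] + 0.5 * h * k2[1]), tau + 0.5 * h)
        k4 = velocity((y[0] + h * k3[0], y[1] + h * k3[1]), tau + h)
        y = (y[0] + h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
             y[1] + h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
        tau += h
    return y

start = (0.0, math.sqrt(0.5))    # lies on S(0), since F(start, 0) = 0
end = flow(start, 0.1)
residual = abs(F(end, 0.1))      # near zero when y(x, tau) stays on S(tau)
```

The gradient of this $F$ does not vanish along the computed path, which is the situation in which the right-hand side of the differential equation is well defined.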

Proof: From Lemma 1 and the fact that $\frac{\partial F}{\partial t}$ is quadratic in $x$, one sees that, if $t$ is sufficiently small, then there exists a constant $K$, independent of $p$, $q$, $r$, $s$, and $t$, such that

\[ \frac{|\frac{\partial F}{\partial t}(x,\tau)|}{|\nabla F(x,\tau)|} \le K\,\frac{(1+r)^{2}}{s}\,|t|^{-2p-q} \]

for $|\tau| \le |t|$ and $x \in M_{p,q,2r,s/2}(t)$. Consequently, $|y(x,\tau) - x| \le 2K\frac{(1+r)^{2}}{s}|t|^{1-2p-q}$ whenever $\tau$ lies in the domain of existence of $y(x,\tau)$. If $\frac{1+r}{s}$ is so small that $2K\frac{(1+r)^{2}}{s} < r$ and $2K\frac{(1+r)^{2}}{s} \le \frac{s}{2}$, then, since $1 - 2p - q \ge q$ and $1 - 2p - q > -p$, we have

(1) $|y(x,\tau) - x| < r|t|^{-p}$ and $|y(x,\tau) - x| < \frac{s}{2}|t|^{q}$

whenever $x \in M(t)$ and $\tau$ lies in the domain of existence of $y(x,\tau)$. If $0 < p < \tfrac{1}{2} - q$, then $1 - 2p - q > q$ and $1 - 2p - q > -p$, and one easily verifies that (1) again holds for small $t$. Thus (1) holds under the hypotheses of the lemma.

Suppose that the hypotheses of the lemma are satisfied (so that (1) holds) and that there exist a $t_0$, $|t_0| \le |t|$, and an $x \in M(t) \cap S(t_0)$ such that $y(x,\tau)$ does not exist for all $\tau$, $|\tau| \le |t|$. Then one can find a $\tau_1$, $|\tau_1| \le |t|$, for which $y(x,\tau_1) \in \partial M_{p,q,2r,s/2}(t)$ [2]. But this contradicts (1), and the lemma is proved.


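Lemma 3: Suppose that $0 \le q \le \tfrac{1}{2}$ and $0 \le p \le \tfrac{1}{2} - q$. If $t$ and $\frac{1+r}{s}$ are sufficiently small, or if $t$ is sufficiently small and $0 < p < \tfrac{1}{2} - q$, then

(i) $R_1(t)\,\Delta\,R_1(0) \subseteq \bigcup_{|\tau| \le |t|} S(\tau)$,

(ii) $[R_1(t)\,\Delta\,R_1(0)] \cap M_{p,q,r,s}(t) \subseteq \{y(x,\tau) : |\tau| \le |t| \text{ and } x \in S(0) \cap M_{p,q,2r,s/2}(t)\}$,

(iii) $\{y(x,\tau) : |\tau| \le |t| \text{ and } x \in S(0) \cap M_{p,q,2r,s/2}(t)\} \subseteq \bigcup_{|\tau| \le |t|} [S(\tau) \cap M_{p,q,3r,s/3}(t)]$.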


Proof:

(i) Suppose that $t > 0$ and $x \in R_1(t) - R_1(0)$. Set $t_0 = \inf\{\tau : x \in R_1(\tau) - R_1(0)\}$. Clearly, $x \in S(t_0)$. The other cases follow similarly.

(ii) If $w \in [R_1(t)\,\Delta\,R_1(0)] \cap M_{p,q,r,s}(t)$, then $w \in S(t_0)$ for some $t_0$, $|t_0| \le |t|$. If $t$ and $\frac{1+r}{s}$ are small, or if $t$ is small and $0 < p < \tfrac{1}{2} - q$, then, by Lemma 2, $y(w,\tau)$ exists for $|\tau| \le |t|$. In particular, $x = y(w,0)$ satisfies $y(x,t_0) = w$. Now, by the remarks after Lemma 2, $|x - w| \le K\frac{(1+r)^{2}}{s}|t|^{1-2p-q}$ for a constant $K$ independent of $p$, $q$, $r$, $s$, and $t$. If $\frac{1+r}{s}$ and $t$ are sufficiently small, or if $t$ is small and $0 < p < \tfrac{1}{2} - q$, then one sees that $|x - w| < r|t|^{-p}$ and $|x - w| < \frac{s}{2}|t|^{q}$. Consequently, $x \in S(0) \cap M_{p,q,2r,s/2}(t)$.

(iii) If $x \in S(0) \cap M_{p,q,2r,s/2}(t)$, then, as in the proof of (ii), one uses an inequality $|y(x,\tau) - x| \le K\frac{(1+r)^{2}}{s}|t|^{1-2p-q}$ to obtain $y(x,\tau) \in M_{p,q,3r,s/3}(t)$ for $|\tau| \le |t|$, if $t$ and $\frac{1+r}{s}$ are sufficiently small, or if $t$ is sufficiently small and $0 < p < \tfrac{1}{2} - q$.
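Proof of the theorem:

One has

\[ P_e(t) - P_e(0) = \left[\int_{R_1(t)} \alpha_{02}p_{02} + \int_{R_2(t)} \alpha_{01}p_{01}\right] - \left[\int_{R_1(0)} \alpha_{02}p_{02} + \int_{R_2(0)} \alpha_{01}p_{01}\right] = \int_{R_1(t)-R_1(0)} G(x)\,dx - \int_{R_1(0)-R_1(t)} G(x)\,dx, \]

where $G(x) = \alpha_{02}\,p_{02}(x) - \alpha_{01}\,p_{01}(x)$. Thus

\[ |P_e(t) - P_e(0)| \le \int_{R_1(t)\,\Delta\,R_1(0)} |G(x)|\,dx, \]

and, for given $p$, $q$, $r$, and $s$, we obtain

(2) \[ |P_e(t) - P_e(0)| \le \int_{[R_1(t)\Delta R_1(0)] \cap L(t)} |G| + \int_{[R_1(t)\Delta R_1(0)] \cap M(t)} |G| + \int_{[R_1(t)\Delta R_1(0)] \cap N(t)} |G|. \]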

We consider the following cases: 

(1) $\nabla F(x,0)$ never vanishes on $S(0)$,

(2) $\nabla F(x,0)$ vanishes somewhere on $S(0)$ and $m > 1$,

(3) $\nabla F(x,0)$ vanishes somewhere on $S(0)$ and $m = 1$.

Case 1: First, the following lemma is needed.

Lemma 4: Suppose $\nabla F(x,0) \ne 0$ on $S(0)$ and $0 \le p < \tfrac{1}{2}$.

If $t$ and $s$ are sufficiently small, then $S(\tau) \cap N(t) = \emptyset$ for $|\tau| \le |t|$.

Proof: Suppose that the lemma is false. Then there exist sequences $\{s_j\}$, $\{t_j\}$, $\{\tau_j\}$, and $\{x_j\}$ with $s_j \to 0$, $t_j \to 0$, $|\tau_j| \le |t_j|$, and $x_j \in S(\tau_j) \cap N(t_j)$. Note that $\nabla F(x_j,\tau_j) \to 0$, since $|\nabla F(x_j,\tau_j)| \le Ks_j|t_j|^{q}$.

Now $F(x,t) = x^T A(t)x + B(t)x + C(t)$, where $A(t) = \tfrac{1}{2}(\Sigma_2^{-1}(t) - \Sigma_1^{-1}(t))$ is symmetric and $A(t)$ and $B(t)$ are continuously differentiable near $t = 0$. Denote by $\mathcal{N}(A(0))$ the null-space of $A(0)$. Writing $A(t) = A(0) + O_1(t)$, $B(t) = B(0) + O_2(t)$, and $x_j = y_j + z_j$, where $y_j \in \mathcal{N}(A(0))^{\perp}$ and $z_j \in \mathcal{N}(A(0))$, one obtains

\[ 0 = \lim_{j\to\infty} \nabla F(x_j,\tau_j) = \lim_{j\to\infty} \{2A(0)y_j + O_1(\tau_j)x_j + B(0)^T + O_2(\tau_j)^T\}. \]

Since $|x_j| \le r|t_j|^{-p}$ and $0 \le p < \tfrac{1}{2}$, it follows that $y_j \to y^* \in \mathcal{N}(A(0))^{\perp}$, and

\[ 0 = 2A(0)y^* + B(0)^T = \nabla F(y^*,0). \]

Note that this equation implies that $B(0)^T \in \mathcal{N}(A(0))^{\perp}$. Furthermore, we have

\[ 0 = F(x_j,\tau_j) = y_j^T A(0)y_j + x_j^T O_1(\tau_j)x_j + B(0)y_j + O_2(\tau_j)x_j + C(\tau_j). \]

As before, $x_j^T O_1(\tau_j)x_j \to 0$ and $O_2(\tau_j)x_j \to 0$. Consequently,

\[ 0 = y^{*T}A(0)y^* + B(0)y^* + C(0) = F(y^*,0). \]

Thus $y^* \in S(0)$ and $\nabla F(y^*,0) = 0$. This contradicts the assumption that $\nabla F(x,0)$ never vanishes on $S(0)$, and the proof is complete.

Using Lemma 4, we obtain from (2) that

(3) \[ |P_e(t) - P_e(0)| \le \int_{L(t)} |G| + \int_{[R_1(t)\Delta R_1(0)] \cap M(t)} |G| \]

for $0 \le p < \tfrac{1}{2}$, if $t$ and $s$ are sufficiently small. If $0 < p < \tfrac{1}{2}$, then the first integral on the right-hand side of (3) approaches zero faster than any power of $t$ as $t$ approaches zero. In addition, we have the following proposition.

Proposition 1: Suppose that $0 \le q \le \tfrac{1}{2}$ and $0 \le p \le \tfrac{1}{2} - q$. If $t$ and $\frac{1+r}{s}$ are sufficiently small, or if $t$ is sufficiently small and $0 < p < \tfrac{1}{2} - q$, then

\[ \int_{[R_1(t)\Delta R_1(0)] \cap M(t)} |G| \le K|t|^{1-q}, \]

where the constant $K$ is independent of $q$.

Proof: It follows from Lemma 3 that if $t$ and $\frac{1+r}{s}$ are sufficiently small, or if $t$ is sufficiently small and $0 < p < \tfrac{1}{2} - q$, then

\[ \int_{[R_1(t)\Delta R_1(0)] \cap M(t)} |G| \le \int_{|\tau| \le |t|} \int_{S(0) \cap M_{p,q,2r,s/2}(t)} |G(y(x,\tau))|\,\frac{|\frac{\partial F}{\partial t}(y(x,\tau),\tau)|}{|\nabla F(y(x,\tau),\tau)|}\,dS(\tau)\,d\tau \le \int_{|\tau| \le |t|} \int_{S(\tau) \cap M_{p,q,3r,s/3}(t)} |G(x)|\,\frac{|\frac{\partial F}{\partial t}(x,\tau)|}{|\nabla F(x,\tau)|}\,dS(\tau)\,d\tau, \]

where $dS(\tau)$ is the element of surface area on $S(\tau)$. (See Spivak [3] for a discussion of integration on manifolds.) It is easily seen that

\[ \int_{S(\tau)} |G(x)|\,\left|\frac{\partial F}{\partial t}(x,\tau)\right|\,dS(\tau) \]

is bounded for $|\tau| \le |t|$ uniformly for $t$ near zero. Furthermore, for fixed $s$, Lemma 1 implies that $|\nabla F(x,\tau)| \ge K|t|^{q}$ for $x \in S(\tau) \cap M_{p,q,3r,s/3}(t)$ and $|\tau| \le |t|$. Consequently,

\[ \int_{[R_1(t)\Delta R_1(0)] \cap M_{p,q,r,s}(t)} |G| \le K|t|^{1-q}. \]

It is easily verified that $K$ is independent of $q$, and the proof is complete.

From Proposition 1 and the preceding remarks, one sees that if $0 < p < \tfrac{1}{2}$, $q = 0$, and $s$ is sufficiently small, then

\[ |P_e(t) - P_e(0)| \le K|t| \]

for small $t$, and the theorem is proved in this case.

Case 2: If $0 < p$, then, as before, the first integral on the right-hand side of (2) approaches zero faster than any power of $t$ as $t$ approaches zero. In addition, Proposition 1 remains valid. Thus, if $0 \le q < \tfrac{1}{2}$, $0 < p < \tfrac{1}{2} - q$, and $\frac{1+r}{s}$ is sufficiently small, then the second integral on the right-hand side of (2) is bounded by $K|t|^{1-q}$ as $t$ approaches zero, where the constant $K$ is independent of $q$. Of course, $S(\tau) \cap N(t) \ne \emptyset$ for all $\tau$, $|\tau| \le |t|$, and we need the following proposition.

Proposition 2: There exists a constant $K$, independent of $q$, for which

\[ \int_{N(t)} |G| \le K|t|^{mq}, \]

where $m$ is the number of non-zero eigenvalues (counting multiplicity) of $\Sigma_1(0)^{-1} - \Sigma_2(0)^{-1}$.

Proof: We have $F(x,t) = x^T A(t)x + B(t)x + C(t)$, where $A(t) = \tfrac{1}{2}(\Sigma_2^{-1}(t) - \Sigma_1^{-1}(t))$, and $\nabla F(x,t) = 2A(t)x + B(t)^T$. Denoting the ball of radius $\rho$ about the origin in $\mathbb{R}^n$ by $B_\rho$, one sees that, if $x_0$ is any solution of $\nabla F(x_0,0) = 0$, then

\[ N(t) \subseteq \{x_0 + y + z : y \in \mathcal{N}(A(0)),\ z \in \mathcal{N}(A(0))^{\perp} \cap B_{s|t|^{q}}\}, \]

where $\mathcal{N}(A(0))$ denotes the null-space of $A(0)$. Now the dimension of $\mathcal{N}(A(0))^{\perp}$ is equal to the number of non-zero eigenvalues (counting multiplicity) of $A(0) = \tfrac{1}{2}(\Sigma_2^{-1}(0) - \Sigma_1^{-1}(0))$. Denoting this dimension by $m$, we obtain (with a slight abuse of notation)

\[ \int_{N(t)} |G| \le \int_{\mathcal{N}(A(0))^{\perp} \cap B_{s|t|^{q}}} \left\{ \int_{\mathcal{N}(A(0))} |G(x_0 + y + z)|\,dy \right\} dz \le K\,(s|t|^{q})^{m} \]

for an appropriate constant $K$ independent of $q$, and the proof is complete.

From the above discussion, one sees that the best rate of decay of the right-hand side of (2) is obtained by choosing $p$ and $q$ such that $0 < p < \tfrac{1}{2} - q$ and $1 - q = mq$. Since $m > 1$, a compatible choice $q = \frac{1}{m+1}$ and $0 < p < \frac{m-1}{2(m+1)}$ yields the desired inequality

\[ |P_e(t) - P_e(0)| \le K|t|^{m/(m+1)}. \]

Case 3: In this case, one sees that $S(0)$ is an $(n-1)$-dimensional hyperplane in $\mathbb{R}^n$ and that $\nabla F(x,0) = 0$ if and only if $x \in S(0)$. By performing a translation of coordinates followed by a unitary transformation on $\mathbb{R}^n$ if necessary, we may assume that $S(0) = \{x = (0, x_2, \dots, x_n)^T\}$. Then $F(x,0) = x^T Ax$, where $A$ has a non-zero entry in the upper left-hand corner and only zero entries elsewhere, i.e., $F(x,0) = \lambda x_1^2$. We will use the sets $L(t)$, $M(t)$, and $N(t)$ defined above with $p = 0$ and $q = \tfrac{1}{2}$, and we set

\[ K_c(t) = \left\{x = (x_1, \dots, x_n)^T \in \mathbb{R}^n : |x_1| \le c\,\sqrt{|t|\,(x_2^2 + \cdots + x_n^2)}\right\}. \]

Proposition 3: If $\frac{1+r}{s}$ is sufficiently small and $c$ is sufficiently large, then $R_1(t)\,\Delta\,R_1(0) \subseteq N(t) \cup K_c(t)$ whenever $t$ is small.

This proposition follows from the two lemmas below.

Lemma 5: If $\frac{1+r}{s}$ is sufficiently small, then $M(t) \cap [R_1(t)\,\Delta\,R_1(0)] = \emptyset$ whenever $t$ is small.

Proof: In $M(t)$, $F(x,0) \ge \lambda s^2|t|$ and $|\frac{\partial F}{\partial t}(x,\tau)| \le K(1+r)^2$ for $|\tau| \le |t|$, where the constant $K$ is independent of $r$, $s$, and $t$ for small $t$. So, for $|\tau| \le |t|$,

\[ |F(x,\tau)| \ge [\lambda s^2 - K(1+r)^2]\,|t| \]

in $M(t)$ whenever $t$ is small. If $\frac{1+r}{s} < \sqrt{\lambda/K}$, one sees that $F(x,\tau) \ne 0$ in $M(t)$ for $|\tau| \le |t|$. Since $R_1(t)\,\Delta\,R_1(0) \subseteq \bigcup_{|\tau| \le |t|} S(\tau)$, the lemma follows.

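Lemma 6: Suppose that $r$ is given. If $c$ is sufficiently large, then $L(t) \cap [R_1(t)\,\Delta\,R_1(0)] \subseteq K_c(t)$ whenever $t$ is small.

Proof: In $L(t)$, $F(x,0) = \lambda x_1^2$ and $|\frac{\partial F}{\partial t}(x,\tau)| \le K|x|^2$ for $|\tau| \le |t|$, where the constant $K$ depends on $r$ but is independent of $t$ for small $t$. So, for $|\tau| \le |t|$, one has

\[ |F(x,\tau)| \ge (\lambda - K|t|)\,x_1^2 - K|t| \sum_{i=2}^{n} x_i^2 \]

in $L(t)$. If $x \in L(t) - K_c(t)$, then

\[ (\lambda - K|t|)\,x_1^2 - K|t| \sum_{i=2}^{n} x_i^2 > [(\lambda - K|t|)\,c^2 - K]\,|t| \sum_{i=2}^{n} x_i^2. \]

If $c$ is sufficiently large, then the right-hand side is positive for small $t$. Consequently,

\[ [L(t) - K_c(t)] \cap [R_1(t)\,\Delta\,R_1(0)] \subseteq [L(t) - K_c(t)] \cap \Big[\bigcup_{|\tau| \le |t|} S(\tau)\Big] = \emptyset, \]

and the proof is complete.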

From Proposition 3, one sees that if $\frac{1+r}{s}$ is sufficiently small and $c$ is sufficiently large, then

\[ |P_e(t) - P_e(0)| \le \int_{N(t)} |G| + \int_{K_c(t)} |G| \]

for small $t$. The two integrals on the right-hand side are easily seen to be bounded by $K\sqrt{|t|}$, and the proof of the theorem is complete.